27 research outputs found

    StyleGAN as a Utility-Preserving Face De-identification Method

    Face de-identification methods have been proposed to preserve users' privacy by obscuring their faces. These methods, however, can degrade the quality of photos, and they usually do not preserve the utility of faces, i.e., their age, gender, pose, and facial expression. Recently, GANs such as StyleGAN have been proposed that generate realistic, high-quality synthetic faces. In this paper, we investigate the use of StyleGAN for generating de-identified faces through style mixing. We examine this de-identification method's ability to preserve utility and privacy by implementing several face detection, verification, and identification attacks and by conducting a user study. The results from our extensive experiments, human evaluation, and comparison with two state-of-the-art methods, i.e., CIAGAN and DeepPrivacy, show that StyleGAN performs on par with or better than these methods, preserving users' privacy and images' utility. In particular, the results of the machine learning-based experiments show that StyleGAN0-4 preserves utility better than CIAGAN and DeepPrivacy while preserving privacy at the same level, and StyleGAN0-3 preserves utility at the same level while providing more privacy. For the first time, we also performed a carefully designed user study to examine the privacy- and utility-preserving properties of StyleGAN0-3, 0-4, and 0-5, as well as CIAGAN and DeepPrivacy, from human observers' perspectives. Our statistical tests showed that participants tended to verify and identify StyleGAN0-5 images more easily than DeepPrivacy images. All the methods but StyleGAN0-5 had significantly lower identification rates than CIAGAN. Regarding utility, as expected, StyleGAN0-5 performed significantly better at preserving some attributes. Among all methods, on average, participants believed that gender was preserved the most while naturalness was preserved the least.
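    The abstract does not spell out the mixing procedure, so the following is only a minimal sketch of layer-wise style mixing. It assumes a pre-trained StyleGAN2 generator G exposing separate mapping and synthesis networks (as in NVIDIA's stylegan2-ada-pytorch codebase) and an already-projected W+ latent w_real for the face to be de-identified; reading "StyleGAN0-4" as copying style layers 0 through 4 from the original face, and the direction of the copy, are assumptions for illustration rather than the paper's exact configuration.

```python
# Hypothetical sketch: face de-identification via StyleGAN style mixing.
# Assumes G.mapping / G.synthesis as in NVIDIA's stylegan2-ada-pytorch and
# w_real of shape [1, num_ws, w_dim] obtained by projecting the real face.
import torch

def style_mix_deidentify(G, w_real, num_copied_layers=5, seed=0, device="cpu"):
    gen = torch.Generator(device=device).manual_seed(seed)
    z = torch.randn([1, G.z_dim], generator=gen, device=device)
    w_fake = G.mapping(z, None)                      # latent of a random synthetic identity
    w_mixed = w_fake.clone()
    # Copy the first few style layers (here 0-4) from the original face so that
    # coarse attributes carry over; all remaining layers come from the synthetic face.
    w_mixed[:, :num_copied_layers] = w_real[:, :num_copied_layers]
    return G.synthesis(w_mixed, noise_mode="const")  # de-identified image tensor
```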

    Impact of Stricter Content Moderation on Parler's Users' Discourse

    Social media platforms employ various content moderation techniques to remove harmful, offensive, and hateful content. The level of moderation varies across platforms, and it can also evolve over time within a single platform. For example, Parler, a fringe social media platform popular among conservative users, was known to have the least restrictive moderation policies, claiming to offer open discussion spaces for its users. However, after the 2021 US Capitol riot was linked to the activity of some groups on Parler, such as QAnon and the Proud Boys, Parler was removed from the Apple and Google app stores on January 12, 2021, and suspended from Amazon's cloud hosting service. Parler had to modify its moderation policies to return to these stores. After a month of downtime, Parler came back online with a new set of user guidelines reflecting stricter content moderation, especially regarding its hate speech policy. In this paper, we study the moderation changes made by Parler and their effect on the toxicity of its content. We collected a large longitudinal Parler dataset of 17M parleys from 432K active users, spanning February 2021 to January 2022, after its return to the Internet and the App Store. To the best of our knowledge, this is the first study to investigate the effectiveness of content moderation techniques using data-driven approaches, and ours is the first Parler dataset collected after its brief hiatus. Our quasi-experimental time series analysis indicates that after the change in Parler's moderation, severe forms of toxicity (scores above a threshold of 0.5) immediately decreased, and the decrease was sustained. In contrast, the trend did not change for less severe threats and insults (scores between 0.5 and 0.7). Finally, we found an increase in the factuality of the news sites being shared, as well as a decrease in the number of conspiracy or pseudoscience sources being shared.
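    The abstract names a quasi-experimental time series analysis without detailing the model. One standard form such an analysis can take is an interrupted (segmented) time-series regression on daily toxicity rates around the date the stricter guidelines took effect; the sketch below shows that form only. The column names, the daily aggregation, and the use of plain OLS are assumptions for illustration, not the paper's exact specification.

```python
# Illustrative interrupted time-series (segmented regression) sketch, not the
# paper's actual model. Expects one row per day with columns 'date' and
# 'severe_toxicity_rate'.
import pandas as pd
import statsmodels.formula.api as smf

def fit_interrupted_time_series(daily: pd.DataFrame, policy_date: str):
    df = daily.copy()
    df["date"] = pd.to_datetime(df["date"])
    df = df.sort_values("date").reset_index(drop=True)
    df["time"] = range(len(df))                                    # days since start of study
    df["post"] = (df["date"] >= pd.Timestamp(policy_date)).astype(int)
    first_post = df.loc[df["post"] == 1, "time"].min()
    df["time_since_policy"] = (df["time"] - first_post).clip(lower=0)
    # 'post' captures the immediate level change; 'time_since_policy' captures
    # whether the change is sustained (a change in slope after the policy).
    return smf.ols("severe_toxicity_rate ~ time + post + time_since_policy", data=df).fit()
```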

    Understanding the Bystander Effect on Toxic Twitter Conversations

    In this study, we explore the power of group dynamics to shape the toxicity of Twitter conversations. First, we examine how the presence of others in a conversation can diffuse Twitter users' responsibility to address a toxic direct reply. Second, we examine whether the toxicity of the first direct reply to a toxic tweet establishes the group norms for subsequent replies. By doing so, we outline how bystanders and the tone of initial responses to a toxic reply are explanatory factors that affect whether others feel uninhibited enough to post their own abusive or derogatory replies. We test this premise by analyzing a random sample of more than 156k tweets belonging to ~9k conversations. Central to this work is the social psychological research on the "bystander effect," which documents that the presence of bystanders can alter the dynamics of a social situation. If the first direct reply reaffirms the divisive tone, other replies may follow suit. We find evidence of a bystander effect: a larger number of users participating in the conversation before a toxic tweet arrives is negatively associated with the number of Twitter users who respond to the toxic reply in a non-toxic way. We also find that the initial responses to toxic tweets within conversations are of great importance. Posting a toxic reply immediately after a toxic comment is negatively associated with users posting non-toxic replies and is associated with Twitter conversations becoming increasingly toxic.
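    The abstract reports associations between prior participation, the tone of the first reply, and the number of non-toxic responders, which suggests some form of count regression. The sketch below shows one generic way such an association could be estimated; the column names and the toy values are purely illustrative assumptions, not the study's data or its actual model.

```python
# Hypothetical count-regression sketch of the bystander-effect association.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One row per toxic reply; values below are made-up placeholders for illustration.
toxic_replies = pd.DataFrame({
    "prior_participants":  [2, 5, 9, 14, 3, 7],   # users already in the conversation
    "first_reply_toxic":   [0, 1, 1, 0, 0, 1],    # was the first response also toxic?
    "nontoxic_responders": [4, 1, 0, 6, 3, 1],    # users who later replied non-toxically
})

# A Poisson model of the number of non-toxic responders; negative coefficients
# on the predictors would match the associations reported in the abstract.
model = smf.glm(
    "nontoxic_responders ~ prior_participants + first_reply_toxic",
    data=toxic_replies,
    family=sm.families.Poisson(),
).fit()
print(model.summary())
```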

    From Chatbots to PhishBots? -- Preventing Phishing scams created using ChatGPT, Google Bard and Claude

    The advanced capabilities of Large Language Models (LLMs) have made them invaluable across various applications, from conversational agents and content creation to data analysis, research, and innovation. However, their effectiveness and accessibility also render them susceptible to abuse for generating malicious content, including phishing attacks. This study explores the potential of using four popular commercially available LLMs - ChatGPT (GPT 3.5 Turbo), GPT 4, Claude, and Bard - to generate functional phishing attacks using a series of malicious prompts. We discover that these LLMs can generate both phishing emails and websites that convincingly imitate well-known brands, and that the generated websites can also deploy a range of evasive tactics to elude detection mechanisms employed by anti-phishing systems. Notably, these attacks can be generated using unmodified, or "vanilla," versions of these LLMs, without requiring any prior adversarial exploits such as jailbreaking. As a countermeasure, we build a BERT-based automated detection tool for the early detection of malicious prompts, preventing LLMs from generating phishing content; it attains an accuracy of 97% for phishing website prompts and 94% for phishing email prompts.
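    The abstract only names the countermeasure, a BERT-based detector for malicious prompts, so the snippet below is a generic fine-tuning skeleton using Hugging Face transformers rather than the authors' released tool. The base checkpoint, hyperparameters, and the placeholder training example are assumptions; a real run would use the labeled prompt corpus described in the paper.

```python
# Minimal BERT sequence-classification fine-tuning skeleton for flagging
# malicious prompts (1 = malicious, 0 = benign). Details are illustrative.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

class PromptDataset(Dataset):
    """Wraps labeled prompt strings for the Trainer API."""
    def __init__(self, prompts, labels, tokenizer):
        self.enc = tokenizer(prompts, truncation=True, padding=True, max_length=256)
        self.labels = labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# Placeholder data; in practice this would be the labeled prompt corpus.
train_ds = PromptDataset(["example prompt text"], [0], tokenizer)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="prompt-detector", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=train_ds,
)
trainer.train()
```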

    POISED: Spotting Twitter Spam Off the Beaten Paths

    Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring together people who have personal connections or share common interests, forming communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to people's interests: dissemination paths emerge in which certain communities post similar messages, driven by those communities' interests. Spam and other malicious content, on the other hand, follow different spreading patterns. We build on this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users and show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems by comparing it to three state-of-the-art spam detection systems previously proposed by the research community; POISED significantly outperforms each of them. Moreover, through simulations, we show that POISED is effective in the early detection of spam messages and resilient against two well-known adversarial machine learning attacks.
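    POISED's actual feature set is not given in the abstract; the sketch below only illustrates the underlying intuition, namely that benign content tends to stay within interest-aligned communities while spam cuts across them. The interaction-graph construction, the choice of community detection algorithm, and the per-message bookkeeping are assumptions for illustration.

```python
# Illustrative propagation feature in the spirit of POISED: for each message,
# count how many distinct user communities its sharers span.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def community_spread(graph: nx.Graph, message_sharers: dict) -> dict:
    """graph: user interaction graph; message_sharers: message id -> list of user nodes."""
    communities = list(greedy_modularity_communities(graph))
    node_to_comm = {n: i for i, comm in enumerate(communities) for n in comm}
    return {
        msg: len({node_to_comm[u] for u in users if u in node_to_comm})
        for msg, users in message_sharers.items()
    }

# The spread counts (together with other propagation and topic features) would
# then be assembled into a feature matrix and fed to a standard supervised
# classifier, with spam expected to spread across many unrelated communities.
```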

    User Engagement and the Toxicity of Tweets

    Twitter is one of the most popular online micro-blogging and social networking platforms. It allows individuals to freely express opinions and interact with others regardless of geographic barriers. However, with the good that online platforms offer also comes the bad: Twitter and other social networking platforms have created new spaces for incivility. With growing interest in the consequences of uncivil behavior online, understanding how a toxic comment affects online interactions is imperative. We analyze a random sample of more than 85,300 Twitter conversations to examine differences between toxic and non-toxic conversations and the relationship between toxicity and user engagement. We find that toxic conversations, those with at least one toxic tweet, are longer but have fewer individual users contributing to the dialogue than non-toxic conversations. However, within toxic conversations, toxicity is positively associated with more individual Twitter users participating in the conversation. This suggests that, overall, more visible conversations are more likely to include toxic replies. Additionally, we examine the sequencing of toxic tweets and its impact on conversations. Toxic tweets often occur as the main tweet or as the first reply, and they lead to greater overall conversation toxicity. We also find a relationship between the toxicity of the first reply to a toxic tweet and the toxicity of the conversation: whether the first reply is toxic or non-toxic sets the stage for the overall toxicity of the conversation, following the idea that hate can beget hate.
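    As a rough illustration of the conversation-level comparison described above, the sketch below groups tweets by conversation and splits conversations by whether any tweet crosses a toxicity threshold. The column names, the 0.5 cutoff, and the use of Perspective-style per-tweet scores are assumptions, not the paper's exact pipeline.

```python
# Sketch: compare length and participation of toxic vs. non-toxic conversations.
import pandas as pd

def summarize_conversations(tweets: pd.DataFrame, toxic_threshold: float = 0.5) -> pd.DataFrame:
    """tweets: one row per tweet with conversation_id, tweet_id, user_id, and a
    'toxicity' score (e.g. from the Perspective API)."""
    conv = tweets.groupby("conversation_id").agg(
        n_tweets=("tweet_id", "size"),
        n_users=("user_id", "nunique"),
        any_toxic=("toxicity", lambda s: (s >= toxic_threshold).any()),
    )
    # Mean conversation length and participation, split by whether the
    # conversation contains at least one toxic tweet.
    return conv.groupby("any_toxic")[["n_tweets", "n_users"]].mean()
```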

    Exploring Gender-Based Toxic Speech on Twitter in Context of the #MeToo movement: A Mixed Methods Approach

    The #MeToo movement has catalyzed widespread public discourse surrounding sexual harassment and assault, empowering survivors to share their stories and holding perpetrators accountable. While the movement has had a substantial and largely positive influence, this study examines its potential negative consequences in the form of increased hostility against women and men on the social media platform Twitter. By analyzing tweets shared between October 2017 and January 2020 by more than 47.1k individuals who had either disclosed their own sexual abuse experiences on Twitter or engaged in discussions about the movement, we identify an overall increase in gender-based hostility towards both women and men since the start of the movement. We also monitor 16 pivotal real-life events that shaped the #MeToo movement to identify how these events may have amplified negative discussions targeting the opposite gender on Twitter. Furthermore, we conduct a thematic content analysis of a subset of gender-based hostile tweets, which helps us identify recurring themes and underlying motivations driving the expressions of anger and resentment from both men and women concerning the #MeToo movement. This study highlights the need for a nuanced understanding of the impact of social movements on online discourse and underscores the importance of addressing gender-based hostility in the digital sphere.
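    The abstract does not say how the 16 pivotal events are analyzed quantitatively; one simple possibility is a before/after window comparison of daily hostility rates around each event, sketched below. The window size, column names, and the notion of a per-day "hostile rate" are assumptions for illustration only.

```python
# Hypothetical event-window comparison of gender-based hostility around events.
import pandas as pd

def event_window_change(daily: pd.DataFrame, event_dates: list, window_days: int = 14) -> pd.DataFrame:
    """daily: DataFrame indexed by date with a 'hostile_rate' column (share of
    gender-based hostile tweets per day). Returns pre/post window means per event."""
    daily = daily.sort_index()
    rows = []
    for event in pd.to_datetime(event_dates):
        pre = daily.loc[event - pd.Timedelta(days=window_days): event, "hostile_rate"].mean()
        post = daily.loc[event: event + pd.Timedelta(days=window_days), "hostile_rate"].mean()
        rows.append({"event": event, "pre_mean": pre, "post_mean": post, "change": post - pre})
    return pd.DataFrame(rows)
```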